Procurement auction using actor-critic type learning algorithm
Authors
Abstract
Procurement, the process of obtaining materials or services, is a critical process for any organization. When procuring a set of items from different suppliers, each of whom may sell only a subset (bundle) of the desired set of items, it is necessary to select an optimal set of suppliers who can supply the desired set of items. This is the optimal vendor selection problem. Bundling in procurement has benefits such as demand aggregation, supplier aggregation, and lead time reduction. The NP-hardness of the vendor selection problem motivates us to formulate a compatible linear programming problem by relaxing the integer constraints and imposing additional constraints. The newly formulated problem can be solved by a novel iterative algorithm proposed recently in the literature. In this paper, we show that the application of this iterative algorithm will lead to an iterative procurement auction that improves the efficiency of the procurement process. By using reinforcement learning to orchestrate the iterations of the algorithm, we show impressive gains in the computational efficiency of the algorithm.
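The LP relaxation described in the abstract can be illustrated with a small sketch. Here the vendor selection problem is modeled as a set-cover-style program: each supplier offers a bundle of items at a fixed cost, the integer choice variables are relaxed to the interval [0, 1], and the relaxed problem is solved with an off-the-shelf LP solver. The suppliers, bundles, and costs below are illustrative assumptions, not data or the exact formulation from the paper.

```python
# Minimal sketch of the LP relaxation of bundle-based vendor selection.
# Suppliers s1..s3 and their bundles/costs are hypothetical examples.
from scipy.optimize import linprog

items = ["A", "B", "C"]
bundles = {
    "s1": ({"A", "B"}, 5.0),   # supplier s1 sells bundle {A, B} for 5.0
    "s2": ({"B", "C"}, 4.0),
    "s3": ({"A", "C"}, 6.0),
}

names = list(bundles)
c = [bundles[s][1] for s in names]  # objective: total procurement cost

# Coverage constraints: each item must be supplied at least once.
# linprog expects A_ub @ x <= b_ub, so A x >= 1 is negated.
A_ub = [[-1.0 if item in bundles[s][0] else 0.0 for s in names]
        for item in items]
b_ub = [-1.0] * len(items)

# Relaxed integrality: 0 <= x_s <= 1 instead of x_s in {0, 1}.
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * len(names))
print(dict(zip(names, res.x)), res.fun)
```

On this instance the relaxation yields the fractional solution x = (0.5, 0.5, 0.5) with cost 7.5, strictly below the best integer selection ({s1, s2} at cost 9.0), which is exactly the gap the paper's additional constraints and iterative algorithm are meant to address.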
Similar resources
An Actor/Critic Algorithm that is Equivalent to Q-Learning
We prove the convergence of an actor/critic algorithm that is equivalent to Q-learning by construction. Its equivalence is achieved by encoding Q-values within the policy and value function of the actor and critic. The resultant actor/critic algorithm is novel in two ways: it updates the critic only when the most probable action is executed from any given state, and it rewards the actor using c...
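For readers unfamiliar with the actor/critic template the abstracts above refer to, the following is a minimal tabular sketch: a softmax actor updated by the TD error computed by the critic. The toy environment, state/action sizes, and step sizes are illustrative assumptions, not the construction from any of the cited papers.

```python
# Minimal tabular actor-critic sketch (illustrative, not the cited algorithm).
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 2
theta = np.zeros((n_states, n_actions))  # actor: softmax policy preferences
V = np.zeros(n_states)                   # critic: state-value estimates
alpha, beta, gamma = 0.1, 0.1, 0.9       # actor step, critic step, discount

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def step(s, a):
    # Toy deterministic environment: reward for reaching state 0.
    s2 = (s + a + 1) % n_states
    return s2, (1.0 if s2 == 0 else 0.0)

s = 1
for _ in range(500):
    pi = softmax(theta[s])
    a = rng.choice(n_actions, p=pi)
    s2, r = step(s, a)
    delta = r + gamma * V[s2] - V[s]   # TD error from the critic
    V[s] += beta * delta               # critic update
    grad = -pi
    grad[a] += 1.0                     # grad of log pi(a|s) for softmax
    theta[s] += alpha * delta * grad   # actor update along the TD error
    s = s2
```

The variants summarized above modify pieces of this template: encoding Q-values inside theta and V, gating critic updates on the most probable action, or replacing the plain gradient with eligibility traces or the natural gradient.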
An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function
We present an analysis of actor/critic algorithms in which the actor updates its policy using eligibility traces of the policy parameters. Most theoretical results for eligibility traces have been limited to the critic's value iteration algorithms. This paper investigates what the actor's eligibility trace does. The results show that the algorithm is an extension of Williams' REINFORCE algori...
Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning
In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates achieved using natural stochastic policy gradients, while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the “building blocks of movement generation”, called motor ...
Actor-Critic Control with Reference Model Learning
We propose a new actor-critic algorithm for reinforcement learning. The algorithm does not use an explicit actor, but learns a reference model which represents a desired behaviour, along which the process is to be controlled by using the inverse of a learned process model. The algorithm uses Local Linear Regression (LLR) to learn approximations of all the functions involved. The online learning...
A novel approach to locomotion learning: Actor-Critic architecture using central pattern generators and dynamic motor primitives
In this article, we propose an architecture of a bio-inspired controller that addresses the problem of learning different locomotion gaits for different robot morphologies. The modeling objective is split into two: baseline motion modeling and dynamics adaptation. Baseline motion modeling aims to achieve fundamental functions of a certain type of locomotion and dynamics adaptation provides a "r...